Streaming Graph Challenge: Stochastic Block Partition
An important objective for analyzing real-world graphs is to achieve scalable
performance on large, streaming graphs. A challenging and relevant example is
the graph partition problem. As a combinatorial problem, graph partition is
NP-hard, but existing relaxation methods provide reasonable approximate
solutions that can be scaled for large graphs. Competitive benchmarks and
challenges have proven to be an effective means to advance state-of-the-art
performance and foster community collaboration. This paper describes a graph
partition challenge with a baseline partition algorithm of sub-quadratic
complexity. The algorithm employs rigorous Bayesian inferential methods based
on a statistical model that captures characteristics of real-world graphs.
This strong foundation enables the algorithm to address limitations of
well-known graph partition approaches such as modularity maximization. This
paper describes various aspects of the challenge including: (1) the data sets
and streaming graph generator, (2) the baseline partition algorithm with
pseudocode, (3) an argument for the correctness of parallelizing the Bayesian
inference, (4) different parallel computation strategies such as node-based
parallelism and matrix-based parallelism, (5) evaluation metrics for partition
correctness and computational requirements, (6) preliminary timing of a
Python-based demonstration code and the open source C++ code, and (7)
considerations for partitioning the graph in streaming fashion. Data sets, source code for the algorithm and metrics, and detailed documentation are available at GraphChallenge.org.
Comment: To be published in the 2017 IEEE High Performance Extreme Computing Conference (HPEC).
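As a rough illustration of the block-partition idea behind the baseline, a node can be reassigned to the block holding most of its neighbors. This is a hypothetical simplification, not the challenge's Bayesian inference: the greedy neighbor-majority update, the function name, and the toy graph are all illustrative only.

```python
import random
from collections import defaultdict

def partition_graph(edges, num_blocks, num_sweeps=10, seed=0):
    """Greedy block assignment: repeatedly move each node to the block
    that holds the most of its neighbors. A crude stand-in for the
    Bayesian block-merge and nodal updates of the challenge baseline."""
    rng = random.Random(seed)
    nodes = sorted({u for e in edges for u in e})
    # start from a random assignment of nodes to blocks
    assign = {v: rng.randrange(num_blocks) for v in nodes}
    nbrs = defaultdict(list)
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    for _ in range(num_sweeps):
        for v in nodes:
            counts = defaultdict(int)
            for w in nbrs[v]:
                counts[assign[w]] += 1
            if counts:
                # move v to the block most of its neighbors occupy
                assign[v] = max(counts, key=counts.get)
    return assign
```

On a graph made of two densely connected groups joined by a single edge, this sweep quickly makes each group internally uniform, which is the qualitative behavior a block-partition algorithm should exhibit.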
VAST Challenge 2015: Mayhem at Dinofun World
A fictitious amusement park and a larger-than-life hometown football hero provided participants in the VAST Challenge 2015 with an engaging yet complex storyline and setting in which to analyze movement and communication patterns. The datasets for the 2015 challenge were large, averaging nearly 10 million records per day over a three-day period, with a simple, straightforward structured format. The simplicity of the format belied a complex wealth of features contained in the data that needed to be discovered and understood to solve the tasks and questions that were posed. Two Mini-Challenges and a Grand Challenge composed the 2015 competition. Mini-Challenge 1 contained structured location and date-time data for park visitors, against which participants were to discern groups and their activities. Mini-Challenge 2 contained structured communication data consisting of metadata about time-stamped text messages sent between park visitors. The Grand Challenge required participants to use both movement and communication data to hypothesize when a crime was committed and identify the most likely suspects from all the park visitors. The VAST Challenge 2015 received 74 submissions, and the datasets were downloaded, at least partially, from 26 countries.
Visualization Evaluation for Cyber Security: Trends and Future Directions
The Visualization for Cyber Security research community (VizSec) addresses longstanding challenges in cyber security by adapting and evaluating information visualization techniques with application to the cyber security domain. This research effort has created many tools and techniques that could be applied to improve cyber security, but the community has not yet established unified standards for evaluating these approaches to predict their operational validity. In this paper, we survey and categorize the evaluation metrics, components and techniques that have been utilized in the past decade of VizSec research literature. We also discuss existing methodological gaps in evaluating visualization in cyber security, and suggest potential avenues for future research in order to help establish an agenda for advancing the state of the art in evaluating cyber security visualization.
GraphChallenge.org: Raising the Bar on Graph Analytic Performance
The rise of graph analytic systems has created a need for new ways to measure
and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE
Graph Challenge has been developed to provide a well-defined community venue
for stimulating research and highlighting innovations in graph analysis
software, hardware, algorithms, and systems. GraphChallenge.org provides a wide
range of pre-parsed graph data sets, graph generators, mathematically defined
graph algorithms, example serial implementations in a variety of languages, and
specific metrics for measuring performance. Graph Challenge 2017 received 22
submissions by 111 authors from 36 organizations. The submissions highlighted
graph analytic innovations in hardware, software, algorithms, systems, and
visualization. These submissions produced many comparable performance
measurements that can be used for assessing the current state of the art of the
field. There were numerous submissions that implemented the triangle counting
challenge and resulted in over 350 distinct measurements. Analysis of these
submissions shows that their execution time is a strong function of the number
of edges in the graph, $N_e$, and is typically proportional to $N_e^{4/3}$ for
large values of $N_e$. Combining the model fits of the submissions presents a
picture of the current state of the art of graph analysis, which is typically
$10^8$ edges processed per second for graphs with $10^8$ edges. These results
are substantially faster than serial implementations commonly used by many graph
analysts and underscore the importance of making these performance benefits
available to the broader community. Graph Challenge provides a clear picture of
current graph analysis systems and underscores the need for new innovations to
achieve high performance on very large graphs.
Comment: 7 pages, 6 figures; submitted to IEEE HPEC Graph Challenge. arXiv admin note: text overlap with arXiv:1708.0686
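The triangle-counting kernel that many submissions implemented can be sketched with a set-intersection approach. This is a minimal illustrative version, not any particular submission's code; the function name is hypothetical.

```python
def count_triangles(edges):
    """Count triangles by intersecting the neighbor sets of each edge's
    endpoints -- the common set-intersection kernel for triangle counting.
    `edges` is an iterable of unordered (u, v) pairs with u != v."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    # every triangle contains three edges, so it is counted three times
    total = sum(len(nbrs[u] & nbrs[v]) for u, v in edges)
    return total // 3
```

Because the per-edge work depends on neighbor-set sizes, the total work grows faster than linearly in the edge count on realistic graphs, consistent with the super-linear execution-time scaling reported above.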
Collaborative Data Analysis and Discovery for Cyber Security
In this paper, we present the Cyber Analyst Real-Time Integrated Notebook Application (CARINA). CARINA is a collaborative investigation system that aids in decision making by co-locating the analysis environment with centralized cyber data sources, and providing next-generation analysts with increased visibility into the work of others. In current-generation cyber work, tools limit analysts' ability to collaborate, often relying on individual record keeping, which hinders their ability to reflect on their own work and transition analytic insights to others. While online collaboration technologies have been shown to encourage and facilitate information sharing and group decision making in multiple contexts, no such technology exists today in cyber. Using visualization and annotation, CARINA leverages conversation and ad hoc thought to coordinate decisions across an organization. CARINA incorporates features designed to incentivize positive information-sharing behaviors, and provides a framework for incorporating recommendation engines and other analytics to guide analysts in the discovery of related data or analyses. In this paper, we present the user research that informed the development of CARINA, discuss the functionality of the system, and outline potential use cases. We also discuss future research trajectories and implications for cyber researchers and practitioners.
Eventpad: Rapid malware analysis and reverse engineering using visual analytics
Forensic analysis of malware activity in network environments is a necessary yet very costly and time consuming part of incident response. Vast amounts of data need to be screened, in a very labor-intensive process, looking for signs indicating how the malware at hand behaves inside, e.g., a corporate network. We believe that data reduction and visualization techniques can assist security analysts in studying behavioral patterns in network traffic samples (e.g., PCAP). We argue that the discovery of patterns in this traffic can help us to quickly understand how intrusive behavior such as malware activity unfolds and distinguishes itself from the rest of the traffic. In this paper we present a case study of the visual analytics tool EventPad and illustrate how it is used to gain quick insights into the analysis of PCAP traffic using rules, aggregations, and selections. We show the effectiveness of the tool on real-world data sets involving office traffic and ransomware activity.
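The rule-and-aggregation workflow described above can be mimicked in a few lines. This is a toy sketch only: the record fields, the function name, and the predicate are hypothetical, and real PCAP parsing is omitted entirely.

```python
from collections import Counter

def aggregate_flows(packets, rule):
    """Group packet records matching a predicate rule into per-flow counts,
    a toy version of rule-based aggregation over PCAP-like data.
    Each packet is a dict with illustrative 'src', 'dst', 'port' fields."""
    flows = Counter()
    for pkt in packets:
        if rule(pkt):
            flows[(pkt["src"], pkt["dst"])] += 1
    return flows
```

Applying a rule such as "destination port 445" collapses thousands of raw records into a handful of flow-level counts, which is the kind of data reduction that makes visual inspection of suspicious traffic tractable.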